Identification and validation of a major QTL for kernel length in bread wheat based on two F3 biparental populations

Background High yield and quality are essential goals of wheat (Triticum aestivum L.) breeding. Kernel length (KL), as a main component of kernel size, can indirectly change kernel weight and then affects yield. Identification and utilization of excellent loci in wheat genetic resources is of great significance for cultivating high yield and quality wheat. Genetic identification of loci for KL has been performed mainly through genome-wide association study in natural populations or QTL mapping based on genetic linkage map in high generation populations. Results In this study, an F3 biparental population derived from the cross between an EMS mutant BLS1 selected from an EMS-induced wheat genotype LJ2135 (derived from the hybrid progeny of a spelt wheat (T. spelta L.) and a common wheat) mutant bank and a local breeding line 99E18 was used to rapidly identify loci controlling KL based on Bulked Segregant Analysis (BSA) and the wheat 660 K single-nucleotide polymorphism (SNP) array. The highest ratio of polymorphic SNPs was located on chromosome 4A. Linkage map analysis showed that 33 Kompetitive Allele Specific PCR markers were linked to the QTL for KL (Qkl.sicau-BLE18-4A) identified in three environments as well as the best linear unbiased prediction (BLUP) dataset. This QTL explained 10.87—19.30% of the phenotypic variation. Its effect was successfully confirmed in another F3 population with the two flanking markers KASP-AX-111536305 and KASP-AX-110174441. Compared with previous studies and given that the of BLS1 has the genetic background of spelt wheat, the major QTL was likely a new one. A few of predicted genes related to regulation of kernel development were identified in the interval of the detected QTL. Conclusion A major, novel and stable QTL (Qkl.sicau-BLE18-4A) for KL was identified and verified in two F3 biparental populations across three environments. Significant relationships among KL, kernel width (KW) and thousand kernel weight (TKW) were identified. Four predicted genes related to kernel growth regulation were detected in the interval of Qkl.sicau-BLE18-4A. Furthermore, this study laid foundation on subsequent fine mapping work and provided a possibility for breeding of elite wheat varieties. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08608-3.

mode is one of the reasons for the above phenomena [1]. Spelt wheat (T. spelta L.), one of the hexaploid wheats, is an archaic cereal with the primitive genomes related to bread wheat. Spelt wheat has high nutritional compositions, and its microelement content is higher and more abundant than common wheat [2,3]. Spelt and bread wheat have the same genome (AABBDD), and their genetic distance is comparatively small, which is helpful for the production of stable cross breeds [4]. Compared with ancestral wheat species, the phenotypic variation of kernel traits in the modern germplasm pool is significantly reduced [5]. Therefore, it is of important significance to excavate new loci controlling kernel traits in wheat breeding. Kernel length (KL), as one of the main components that constitute kernel size, is a complex quantitative trait controlled by multiple genes. It can indirectly change kernel weight and ultimately affect yield [6]. Furthermore, genetic and phenotypic structures support variations in kernel size and shape. Kernel size increases gradually through changes in KW and KL, and in the later stage, changes in kernel shape are mainly realized through changes in KL [5]. For example, TaGL3-5A was co-located with a significant QTL for KL. Correlation analysis revealed that the TaGL3-5A-G allele was significantly correlated with longer KL and higher thousand kernel weight (TKW) [7]. TaGS-D1 is associated with kernel weight and KL in common wheat [8]. Thus, it is extremely important to identify and utilize loci associated with KL in wheat gene resources for cultivating wheat with high yield and quality.
According to the extreme differences of individual phenotypes in the offspring population produced by a pair of parents with related traits, two gene banks were constructed by screening and collecting DNA samples for of Bulked Segregant Analysis (BSA). This method usually provided a convenient and quick method for identifying markers for genomic regions associated with target traits [18,19]. In previous studies, PCR-based molecular markers, such as amplified fragment length polymorphism (AFLP) and simple sequence repeats (SSR), had been generally used for gene mapping, but these markers usually take a long time to construct the map, and the density of map cannot meet the demand of fine mapping [20]. SNPs are important resource of polymorphic markers and can be used for gene mapping in the genome of any living organism. In the process of genotyping and marker-assisted selection, the wheat 660 K SNP array is an accurate, economical and dependable option [21].
To our knowledge, many reports have shown that using early generations of wheat to rapidly conduct major and stable QTL analysis combined with BSA and the wheat 660 K SNP array. For example, on the basis of BSA and the wheat 660 K SNP array in F 1 , F 2 , and F 2:3 populations, YrZl31 conferring stripe rust resistance was mapped on chromosome 2BL [20]. Qyryac.nwafu-2BS, a novel QTL for adult plant resistance to stripe rust was identified on 2B using BSA and 660 K SNP array in an F 2:3 population [22]. The above method has been widely used to identify wheat resistance genes, but rarely applied in quantitative traits. QTL related to KL in wheat have been detected on almost each of the chromosomes, but they were usually shown as micro-effect and unstable. Most of them were detected in a single population and not verified in diverse backgrounds. Therefore, it is necessary for wheat breeding to excavate and identify major and stable QTL for KL.
In this study, we employed early generations of wheat population in combination with BSA-660 K SNP array to detect genetic differences between two pools with extreme phenotypes of KL. And we further identify and validate major and stable QTL for KL across different environments base on a linkage map.

Plant materials
Two F 3 biparental populations derived from the crosses between three wheat genotypes: BLS1 and 99E18 (BLE18, the mapping population) and between BLS1 and Sumai3 (BLSM3, the validation population), comprising 237 and 178 lines, respectively, were used in this study. BLS1 is a genetically stable mutant (M5 generation) selected from an EMS-induced wheat genotype LJ2135 mutant bank. LJ2135 was a genetically stable line and is derived from the hybrid progeny of a spelt wheat (T. spelta L., LS5893) and a common wheat (JM6893), Given its relatively better agronomic traits including appropriate plant height and large spike, LJ2135 was selected to be a breeding parent. The KL of BLS1 can reach 8 mm under conventional cultivation conditions. The lines 99E18 and Sumai3 show shorter KL than BLS1 ( Fig. 1a and Figure S1), and both have excellent disease resistance [23], especially Sumai3, which is a classic wheat variety with famous fusarium head blight resistant gene, Fhb1 [24].

Phenotypic evaluation
In October of 2020, the BLE18 and BLSM3 populations were grown in three different environments including Wenjiang (103°51ʹE, 30°43ʹN), Chongzhou (103°38ʹE, 30°32 ʹN), and Ya'an (103°0ʹE, 29°58ʹN) of Sichuan Province in China. Each line was planted in a single 1.5 m row  0.3 m between rows, and 15 kernels were sown in a row with 0.1 m between kernels within a row. The field management followed the local practices of wheat production. After the harvested kernels were dried at 37 °C in an oven until a constant dry weight was obtained [25], 36 full and uniform kernels mainly from the spikelets located in the middle of the spike of different plants in each line were selected and scanned by the Epson Expression 10,000 XL flatbed color image scanner (Seiko Epson Corporation, Japan) and further analyzed by WinSEE-DLE (Regent Instruments Canada Inc) for measurements of KL and KW. Cell length and width of kernel pericarp from three parents were observed and measured using the Quanta 450FEG scanning electron microscope. Different lines were characterized phenotypically in different environments. Totally, 239, 218 and 137 lines of BLE18 were measured in Wenjiang, Chongzhou, and Ya'an, respectively. Notably, some lines with poor-quality kernels especially in Ya'an were not used for further analysis, and the statistics of lines used for kernel investigation in Ya'an were listed in Table S1. Hundred kernel weight (HKW) was obtained by randomly weighting 100 kernels per line through an electronic balance with the precision of 0.01 g, and TKW calculated as 10-folds of the average HKW with three replicates.

BSA, Wheat 660 K SNP array analysis, and KASP markers development
The combination of BSA and the wheat 660 K SNP array was performed to identify SNPs between two parents and two phenotypically contrasting pools. The 30 lines in each of two relative extreme mixing pools were selected through the following steps: (1) The phenotypic data of KL of BLE18 in three environments were arranged in a descending order. The number of 40 lines with the maximum and minimum values in the three environments were respectively taken and recorded. (2) Average values of each line from three-environments phenotypic data of KL were calculated and recorded. (3) The intersection of the numbers obtained in (1) and (2) was screened, and 30 lines with the maximum and minimum values were finally selected. The equal amounts of genomic DNA from 30 lines with the shortest KL (10 kernels per line), 30 lines with the longest KL (10 kernels per line) and two parents were extracted using the CTAB method [26]. The DNA were further used for the wheat 660 K SNP array analysis in Beijing CapitalBio Technology Co, Ltd. Two parent pools were used to exclude interference of false positive markers. For example, one marker showing polymorphic between two phenotypically contrasting pools, but not between parents will be excluded.
The polymorphic SNPs between two extreme mixing pools associated with the target QTL were converted into Kompetitive Allele Specific PCR (KASP) markers to construct linkage map. The sequences of primers were designed as per the KASP primer design manual [27], and the primer sequences are listed in Table S2. The allele-specific forward primers were designed carrying FAM (5′-GAA GGT GAC CAA GTT CAT GCT-3′) and HEX (5′-GAA GGT CGG AGT CAA CGG ATT-3′) at the 5′ end. In this study, the KASP amplification reaction was conducted in volume of 10 μl including 5 μl of 1 × Sso-Fast EvaGreen mix (Bio-Rad, Hercules, CA, USA), 1.4 μl of mixture forward and reverse primers, 3.1 μl of deionized water and 0.5 μl of 50-80 ng/ μl DNA. PCR cycling was performed using the following procedure: hot starting at 94 °C for 15 min, 10 touchdown cycles (94 °C for 20 s; touchdown 60 °C, drop 0.6 °C per cycle, for 60 s), and then by 25 cycles of amplification at 95 °C for 20 s, and 55 °C for 60 s.

Genotyping and genetic map construction
A total of 33 KASP markers were developed to detect SNP polymorphism between two parents, and the effective polymorphic markers were further used to genotype BLE18 population. The results of markers' classification were applied to construct the genetic linkage map with the Kosambi function format in JoinMap 4.0. The sequences of two flanking makers were used to blast (E-value of 1e-5) against the genome assembly of IWGSC RefSeq v2.1 (http:// 202. 194. 139. 32/ blast/ blast. html) to obtain the physical locations [28]. The linkage map was drawn by MapChart 2.2.

Data analysis and QTL mapping
Phenotypic variation, frequency distributions, and Pearson's correlation coefficient were implemented using SPSS 22 (IBM SPSS, Armonk, NY, USA). The best linear unbiased prediction (BLUP) dataset of KL under three environments were calculated using SAS V8.0 (SAS Institute, Cary, NC, USA; https:// www. sas. com). The BLUP were calculated based on the following model: where f is a fixed-effects vector, Xi is an incidence vector, a i is the value of phenotype and e i is the environmental deviation [29]. Based on the genetic linkage map and kernel phenotypic data in three environments as well as the BLUP, we identified the QTL for KL using IciMapping 4.1 with the Inclusive Composite Interval Mapping (ICIM) setting the LOD threshold ≥ 2.5 [30]. In addition, IciMapping 4.1 was used to analyze QTL × environment (QE) interaction, and the pre-adjusted parameters were as follows: step = 1 cM, PIN = 0.001 and LOD = 2.5. When the Percentage Variation Explained (PVE) of a given QTL was greater than 10% and can be detected repeatedly in multiple environments, it was considered as a major and stably expressed QTL. QTL was named in accordance with the International Rules of Genetic Nomenclature [16]. For example, the investigated QTL was designated as Qkl. sicau-BLE18-4A. In detail, 'sicau' means 'Sichuan Agricultural University'; the uppercase 'Q' indicates 'QTL'; 'kl' represents kernel length, the abbreviation of the trait in this study; 'BLE18' indicates the mapping population; '4A' , represents the chromosome of this QTL.

Validation of the major QTL
After obtaining the initial result of QTL mapping, the flanking markers were used to verify the effect of this QTL in another genetic background to further determine its stability, authenticity, and reliability. Based on the marker profiles, the population was divided into two categories: lines with homozygous alleles from either parent BLS1 or Sumai3 (excluding heterozygous lines). Then student's t-test (P < 0.05) was utilized to detect significant difference between phenotypic data of two different types of lines.

Comparison of QTL for KL on chromosome 4A
To confirm whether the major QTL detected in this study is a novel locus, the sequences of flanking KASP markers (KASP-AX-111536305 and KASP-AX-110174441) for Qkl.sicau-BLE18-4A and those for previously identified QTL were utilized to perform blast searches against the reference genome sequence of T. aestivum cv. Chinese Spring (IWGSC RefSeq v2.1). Then, we compared the physical intervals to determine whether they overlap.

Orthologous alignment
The flanking markers of the major QTL in this study were further aligned with the physical map of Chinese Spring (IWGSC RefSeq v2.1) and wild emmer (T. turgidum ssp dicoccoides, WEWSeq v2.0) to get orthologous genes. These genes were further analyzed for gene annotation and function on UniProt (http:// www. unipr ot. org/) [28,31]. Furthermore, to identify the possible candidate genes of KL, the relative expression levels of the genes identified in the interval of Qkl.sicau-BLE18-4A were analyzed on the Triticeae Multiomics Center website (http:// 202. 194. 139. 32/ expre ssion/ wheat. html) and obtained through the wheat expression database of Chinese Spring cv-1 Development (single) [32].

Phenotypic assessment and correlation analysis
Descriptive statistics for KL, KW and TKW of parents and two populations are presented in Table 1. Compared with BLS1, the KL values of 99E18 were significantly lower, but the KW values were significantly higher (Table 1and Fig . 1a). The KL and TKW values of BLS1 were significantly higher than Sumai3, except for individual environment ( Table 1). The frequency distributions of KL showed approximate normal distributions in BLE18 across different environments (Fig. 2). The ranges of KL, KW and TKW were 6.53-8.98 mm, 3.17-4.29 mm, and 29.70 -75.60 g, respectively, in BLE18 population. In all environments of BLSM3, KL ranged from 6.65 to 9.51 mm, KW from 2.68 to 4.24 mm and TKW from 34.57 to 67.40 g (Table 1). Both cell length and width of the kernel pericarp were significantly greater in BLS1 than in 99E18 and Sumai3, in agreement with the relationship of KL among three parents (Fig. 1b, c, d). The positive correlations for KL were detected among three environments with coefficient ranging from 0.52 to 0.55 in BLE18, and 0.43 to 0.56 among three environments in BLSM3 (P < 0.01, Table S3). The correlations among the three kernel traits were all significant for each other in BLE18 across multiple environments, with coefficient ranging from 0.17 to 0.79 (P < 0.05, Table 2). Significant and positive correlations were observed among KL, KW and TKW in BLSM3 across three environments, with correlation ranging from 0.03 to 0.76. Especially, there were extremely significant correlations (P < 0.01) between TKW and KL or KW in three environments, and the correlation coefficients ranged from 0.38 to 0.76 (Table 2).

BSA and Wheat 660 K analysis
After genotyping with the wheat 660 K SNP array, the number of homozygous polymorphic SNPs between the extreme pools and two parent pools were confirmed, resulting in a total of 764 SNPs, and 446 of them (58.4%) were observed on chromosome 4A (Fig. 3a). Most of the SNPs on 4A were within an interval of 90-160 Mb (Fig. 3b). The number of other SNPs unequally distributed across other chromosomes were ranging from 2 to 77. These results preliminarily determined that a locus controlling KL was most likely located on chromosome 4A.

Genetic map construction and QTL mapping
A total of 33 KASP markers were developed based on polymorphic SNPs between two parents and two phenotypically contrasting pools on chromosome 4A, and were further used to genotype the population BLE18. Combined with the genotyping results of different markers, the genetic linkage map was constructed with a length of 28.9 cM (Fig. 3c).
A stable QTL (Qkl.sicau-BLE18-4A) for KL was detected in three environments as well as BLUP dataset. It was mapped between KASP-AX-111536305 and KASP-AX-110174441, with an interval of 2 cM (Table 3 and Fig. 3c). It explained 10.87-19.30% of the phenotypic variation, with LOD values ranging from 3.01 to 6.41. The positive allele of Qkl.sicau-BLE18-4A was contributed by the parent BLS1, and the KL values of the lines carrying BLS1 alleles were significantly higher than those carrying 99E18 ones (Fig. 3h). No QTL for KW or TKW was identified on this chromosome. Further analysis also showed that this major QTL had no direct effect on KW and TKW ( Figure S2).

Validation of the major QTL for KL
The flanking markers tightly linked to Qkl.sicau-BLE18-4A were used to verify the effects of Qkl.sicau-BLE18-4A in BLSM3 across different environments. The polymorphism between BLS1 and Sumai3 was detected using KASP-AX-111536305 and KASP-AX-110174441. According to the genotyping results, the population were divided into two categories. The KL of the lines carrying positive alleles from BLS1 was significantly higher than that of the lines without positive alleles (P < 0.05, Fig. 4), and the differences between the two categories ranged from 3.28% to 4.70% in the validation population across three environments.
The homozygous lines of parental alleles 'BLS1' and ʻSuami3' at Qkl.sicau-BLE18-4A were selected in accordance with the genotyping data of KASP-AX-111536305 and KASP-AX-110174441 for BLSM3 population. T-test showed that the KW and TKW of the lines carrying the 'BLS1' alleles were not significantly different from those of the lines without positive alleles in BLSM3 population across different environments, which further confirmed that Qkl.sicau-BLE18-4A had no genetic effect on KW and TKW ( Figure S3).

Table 1 Phenotype of the parents and two populations in different environments
Environment was defined with year and location. WJ, Wenjiang; CZ, Chongzhou; YA, Ya'an; BLUP, the best linear unbiased prediction; SD, standard deviation; CV, variation coefficient; /, data missing; **Significant at P < 0.01 between the two parents

Qkl.sicau-BLE18-4A is a novel and stable QTL
Here, a major and stably expressed QTL, Qkl.sicau-BLE18-4A related to KL was detected in wheat between KASP-AX-111536305 and KASP-AX-110174441 on chromosome arm 4AS. In addition, its genetic effect was further validated in another verification population (Fig. 4), which suggested its reliability and stability.
Qkl.sicau-BLE18-4A was physically located between 101.74 Mbp and 109.25 Mbp on 4AS of Chinese Spring. To the best of our knowledge, only a few reports have documented QTL related to KL on 4A, and most of them were located on the long arm of 4A (4AL), such as QKl-4A.1, qKL-4A, QKl.sdau-4A, QKL.sicau-4A, QGl.ccsu-4A.1 and QKl.sau-4A.1 [11,15,[33][34][35][36][37][38] (Table S5). In addition to the QTL mentioned above, we found some QTL on 4AS in previous reports. For example, Q37, a minor QTL, was located in the physical position of 47 Mbp, and the flanking marker Xgwm397 of Q37 was located at 47.3 Mbp [39]. QGl-4A was in the physical interval from 120.0 to 140.4 Mbp on 4AS and mapped into a 0.37 cM genetic interval [40]. QKl.sau-4A.2 was co-located between 41.7 and 60.3 Mbp on chromosome 4AS of CS reference genome [38]. None of the above three genes overlapped with the interval of Qkl.sicau-BLE18-4A. In addition, as BLS1 is a mutant from an EMS-induced wheat genotype that was derived from the hybrid progeny of a spelt wheat and a common wheat. We think Qkl.sicau-BLE18-4A from this mutant may be different from previously reported ones, and should be  Table 2 Correlation of different kernel traits in BLE18 and BLSM3 populations across different environments KL kernel length (mm), KW kernel width (mm), TKW, thousand kernel weight (g), Environment was defined with year and location. WJ, Wenjiang; CZ, Chongzhou; YA, Ya'an; **Significant at P < 0.01, *Significant at P < 0.05  [3]. Hybrids of common wheat and spelt wheat provide a positive selection by eliminating undesirable quality and improving nutritional value of bread wheat [41]. Thus, the lines carrying positive allele at Qkl.sicau-BLE18-4A should have breeding value given its spelt wheat genetic background.

Analysis of the predicated genes in the interval of Qkl. sicau-BLE18-4A
We attempted to predict candidate genes for the iden-  (Table S6 and Fig. 3e, f ). For 33 orthologs, three previously reported genes might be related to kernel traits, and their relative expression levels (http:// 202. 194. 139. 32/ expre ssion/ wheat. html) were relatively high in kernels ( Figure S4). For example, TraesCS4A02G094300 encodes phosphatase 2C, a negative regulator of abscisic acid signaling [42]. TraesCS4A02G094500 encodes tropinone reductases which function at the branch point of tropane alkaloid metabolism [43]. TraesCS4A02G095500 encodes an adenyl-nucleotide exchange factor activity that was related to signal transmission [44].
Interestingly, the relative expression levels of TraesC-S4A03G0195700 and TraesCS4A03G0196500 were the highest in the kernel compared with other tissues (leaves, spike and root). The expression levels of TraesC-S4A03G0193400 and TraesCS4A03G0193900 in kernel were not the highest, but their expression levels in kernels were almost the same as other tissues ( Figure S4). In summary, the above four genes are the emphasis in our subsequent research of fine mapping and gene cloning.

Correlation between kernel traits
In this study, KL, KW and TKW were positively correlated in BLE18 across three environments. Notably, the correlation between KL and TKW, as well as between KW and TKW were positively significant (Table 2), which was consistent with previous studies [15,34,[45][46][47].
Additionally, we should notice that when analyzing effects of a QTL for a given trait on other traits, loci of these corresponding traits in lines harboring the given QTL should be excluded. Otherwise, real effect of a given QTL may be covered and cannot be detected. In our present study, only one major QTL for KL (Qkl.sicau-BLE18-4A) was detected. We also did not detect any QTL for KW and TKW in this interval, suggesting that loci for these two traits may be located in other intervals or chromosomes. Therefore, in our analysis of effects of Qkl. sicau-BLE18-4A on KW and TKW, the loci for KW and TKW mapped on other intervals may interfere the detection ability, resulting in no significant difference in KW or TKW between lines with the allele from BLS1 and that from 99E18 for Qkl.sicau-BLE18-4A. We thus suggest that genome-wide analysis of loci for KL, KW and TKW could factually reveal their genetic correlations. Anyway, the major and novel QTL for KL identified in this study could be useful for dissecting mechanism of KL in the future and could have potential in breeding. Aggregating Qkl.sicau-BLE18-4A with other loci/genes controlling KW may have significant effects on TKW. The kernels are developed from the carpel during wheat flowering, and then the endosperm cells begin to divide and expand, following by the accumulation of starch and protein. KL and KW determined the kernel morphology. Larger kernel size was related to higher kernel weight [48].Thus, among many factors that constitute kernel weight, KL and KW play a vital role that cannot be ignored.

The feasibility of QTL mapping of quantitative traits using BSA and wheat 660 K SNP arrays in low generation populations
BSA has been widely used in wheat genetic mapping, and it is one of the effective methods to rapidly obtain target genes [18,49,50]. For species with large and complex genomes like wheat, it is a huge and laborious project to complete genome sequencing. Therefore, more convenient, faster and accurate methods are needed to improve the credibility of the results in mapping of wheat traits. So far, many studies have been using BSA combined with various sequencing methods to achieve gene localization [51,52]. For example, the combination of BSA and the wheat 660 K SNP array has been widely used for localization of interest genes especially for disease resistant loci. A major QTL (Qyrnap.nwafu-2BS) for stripe rust resistance was identified on chromosome arm 2BS following BSA and the wheat 660 K SNP array analysis in an F 2:3 population [19]. The early leaf senescence gene Els2 was mapped in an F 2 population using BSA and the wheat 660 K SNP arrays [53]. However, a few studies had shown that BSA-660 K SNP arrays can be used to identify QTL for agronomic traits in low generations of wheat. Recently, a total of 48 QTL controlling spike-related traits located on 18 chromosomes were identified using BSA and the wheat 660 K SNP array in genetic analysis of F 2 and F 2:3 populations [40]. Another example is that a novel reduced height gene was localized on chromosome arm 2DL using BSA-660 K SNP arrays in four F 2 segregating populations [54]. In this study, two low-generation (F 3 ) populations were used to rapidly identify and validate QTL related to KL. This method greatly improved the efficiency and shortened the time. Collectively, our results combined with previous studies suggested that QTL mapping of agronomic traits using BSA and the wheat 660 K SNP array in combination with linkage map analysis in low generation populations is a rapid and effective method.